Bloom Daddy’s Sales Exploration

a DATA DRUID deep dive

Jake Weber

jakeweber.io

1 Introduction

Prepared this Sales Analysis for Bloom Daddy, a purveyor of rare, exotic houseplants.

When I was consulting with him, he seemed keen to understand a few facets of the data provided by Etsy regarding his online business. At a high level, we came up with the following questions.

  1. How are his sales doing, both recent and historically?
  2. What products drove the most revenue?
  3. Can we create a marketing strategy based on consumer behavior?

Below you’ll find a report developed to answer some of these questions. We dive into recent numbers, look at revenue over time by product, and thenn dive into an RFM (Recency, Frequency, Monetary) Analysis to segment his consumer base and tailor strategies to these segments.

1.1 Packages & Functions

#packages ----

#Workhorse
library(tidyverse)
library(lubridate)

#Import & Export
library(readxl)
library(writexl)
library(readr)

#Formatting & Visualization
library(ggdist)
library(ggrepel)
library(tidyquant)
library(hrbrthemes)
library(kableExtra)
library(viridisLite)
library(scales)
library(DT) 


detect_na <- function(data) {
    
fdat1 <- data %>%
    summarise_all(~ sum(!is.na(.)))
    
fdat2 <- data %>%
    summarise_all(~ sum(is.na(.)))
    
fdat3 <- data %>% 
    summarise_all(~ sum(is.na(.)) / length(.))
    
    fdat1_2 <-  fdat1 %>% 
        pivot_longer(everything(), names_to = "column_names", values_to = "non_NULL")
    
    fdat2_2 <- fdat2 %>% 
        pivot_longer(everything(), names_to = "column_names", values_to = "NULL")
    
    fdat3_2 <- fdat3 %>% 
        pivot_longer(everything(), names_to = "column_names", values_to = "percent_NULL")
    
    fdat1_2 %>% 
        left_join(fdat2_2, by = c("column_names")) %>% 
        left_join(fdat3_2, by = c("column_names")) %>% 
        arrange(desc(percent_NULL)) %>% 
        kbl(align = "l", format.args = list(big.mark = ",")) %>% 
        kable_styling(
           full_width = F,
           bootstrap_options = c("hover", "responsive", "striped"))

}

tablekable <- function(data) {
     data %>% 
         kbl(align = "l") %>% 
         kable_styling(
            full_width = F,
            bootstrap_options = c("hover", "responsive", "striped"))
}

tabledata <- function(data) {
    data %>% 
        datatable(filter = "bottom", style = "bootstrap5")
}

slicedt <- function(data) {
    data %>% 
        slice(1, 
            floor(nrow(.) * 0.2),
            floor(nrow(.) * 0.4),
            floor(nrow(.) * 0.6),
            floor(nrow(.) * 0.8),
            nrow(.)
        ) %>% 
        datatable(filter = "bottom", style = "bootstrap5")
}

1.2 Import Data

dat_import <- read_excel("data/etsy + item attributes.xlsx", 
    sheet = "dat")

dat_import %>% glimpse()
## Rows: 2,774
## Columns: 39
## $ `Sale Date`         <dttm> 2021-12-29, 2021-12-20, 2021-12-14, 2021-12-10, 2…
## $ `Item Group Old`    <chr> "philodendron", "philodendron", "anthurium", "phil…
## $ `Item Name Old`     <chr> "philodendron florida ghost", "philodendron pink p…
## $ `Item Group`        <chr> "Tarzan", "Tarzan", "Korok", "Tarzan", "Heat Pack"…
## $ `Item Name`         <chr> "Tornado", "Hyrulian Princess", "Hope Korok", "Hyr…
## $ `Item Maturity`     <chr> "s", "s", NA, "s", NA, NA, "s", "s", "s", "s", NA,…
## $ `Item Variegation`  <chr> NA, "high", NA, "high", NA, NA, "high", "high", NA…
## $ `Ship Name`         <chr> "Daniel James Hurt", "Katy Hay", "Glenn Sato", "An…
## $ `Buyer ID`          <chr> "Decker", "Deadshadow", "Dawnreaper", "Dayspear", …
## $ Buyer               <chr> "Daniel Hurt (danieljhurt)", "Freud Who (Freud2)",…
## $ Quantity            <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ Price               <dbl> 55.0, 69.0, 135.0, 110.0, 6.0, 549.0, 69.0, 69.0, …
## $ `Coupon Code`       <chr> NA, NA, NA, NA, "cyber2021;cyber2021", "cyber2021;…
## $ `Coupon Details`    <chr> NA, NA, NA, NA, "cyber2021 - % off;cyber2021 - % o…
## $ `Discount Amount`   <dbl> 0.00, 0.00, 0.00, 0.00, 55.50, 0.00, 6.90, 6.90, 3…
## $ `Shipping Discount` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ `Order Shipping`    <dbl> 11.5, 11.5, 11.5, 11.5, 0.0, 0.0, 11.5, 11.5, 11.5…
## $ `Order Sales Tax`   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ `Item Total`        <dbl> 55.0, 69.0, 135.0, 110.0, 6.0, 549.0, 69.0, 69.0, …
## $ Currency            <chr> "USD", "USD", "USD", "USD", "USD", "USD", "USD", "…
## $ `Transaction ID`    <dbl> 2814238693, 2803868757, 2790955322, 2781295834, 27…
## $ `Listing ID`        <dbl> 1084945150, 1022782347, 1098887289, 1138806105, 96…
## $ `Date Paid`         <dttm> 2021-12-29, 2021-12-20, 2021-12-14, 2021-12-10, 2…
## $ `Date Shipped`      <dttm> 2022-01-06, 2021-12-30, 2021-12-19, 2021-12-19, 2…
## $ `Ship Address1`     <chr> "1520 melody ln", "8507 N Capital Of Texas Hwy", "…
## $ `Ship Address2`     <chr> NA, "Apt 3013", NA, "#55", NA, NA, NA, NA, NA, NA,…
## $ `Ship City`         <chr> "Fullerton", "Austin", "Honolulu", "Cambridge", "B…
## $ `Ship State`        <chr> "CA", "TX", "HI", "MA", "WA", "WA", "HI", "CT", "F…
## $ `Ship Zipcode`      <chr> "92831", "78759-7997", "96815", "2139", "98010", "…
## $ `Ship Country`      <chr> "United States", "United States", "United States",…
## $ `Order ID`          <dbl> 2327112914, 2318304177, 2309507871, 2301819693, 22…
## $ Variations          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ `Order Type`        <chr> "online", "online", "online", "online", "online", …
## $ `Listings Type`     <chr> "listing", "listing", "listing", "listing", "listi…
## $ `Payment Type`      <chr> "online_cc", "online_cc", "online_cc", "online_cc"…
## $ `InPerson Discount` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ `InPerson Location` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ `VAT Paid by Buyer` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ SKU                 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

1.3 Tidy Data

dat_tidy <- dat_import %>%    #rename columns
    rename_with(tolower) %>% 
    rename_with(~ str_replace_all(
        string = .,
        pattern = " ",
        replacement =  "_")) %>%     
    rename(customer_name = buyer_id) %>%   #remove extra columns
    select(
        sale_date, customer_name, order_id,
        item_group, item_name,  quantity, price, 
        ship_city, ship_state, ship_zipcode, ship_country
    ) %>% 
    mutate(sale_date = lubridate::as_date(sale_date),  #manage data types
           order_id = as.character(order_id)) %>% 
    filter(item_group != "heat pack") %>%   #remove item not needing analysis
    mutate_if(is.character, str_to_title) %>%    #formatting
    mutate_at(c("quantity", "price"), as.integer)


dat_tidy %>% slicedt()

2 Foundational Data

Here we produce the data that breathes life into the rest of the execution and experimentation in this paper.

2.1 Orderlines

Orderlines contains the individual line items making up the transactions.

dat_orderlines <- dat_tidy %>% 
    arrange(sale_date, desc(price)) %>% 
    select(sale_date, item_group, item_name, 
           quantity, price,
           customer_name, order_id) 

dat_orderlines %>% slicedt()

2.2 Orders

Orders is an aggregate view on Orderlines using the Order ID. It effectively shows what was purchased in that order. I’ve even fashioned a “receipt”.

dat_orders <- dat_orderlines %>% 
    group_by(sale_date, customer_name, order_id) %>% 
    mutate(receipt = paste0(item_name)) %>% 
    summarise(
        order_price = sum(price),
        order_quantity = sum(quantity),
        order_receipt = toString(unique(receipt)),
        .groups = "drop") 

dat_orders %>% slicedt()

2.3 Customers

This is where the consumers live and will be the backbone of the RFM analysis.

dat_customers <- dat_orderlines %>% 
    group_by(customer_name) %>% 
    mutate(items = paste0(item_name)) %>% 
    summarise(
        total_spent = sum(price),
        total_items = sum(quantity),
        total_orders = n_distinct(order_id),
        first_purchase = min(sale_date),
        last_purchase = max(sale_date),
        receipt = toString(unique(items))) %>% 
    ungroup() %>% 
    mutate(
        analysis_date = ymd("20220101"),
        tenure =  analysis_date - first_purchase,
        recency = as.integer( last_purchase - analysis_date ) ,
        monetary = total_spent,
        frequency = total_orders,
        total_spent = total_spent %>% scales::dollar(accuracy = 1),
        r = ntile(x = recency, n = 5),
        f = ntile(x = frequency, n = 5),
        m = ntile(x = monetary, n = 5),
        rfm_score = paste(r,f,m),
        rfm_sum = as.integer(r)+as.integer(f)+as.integer(m),
        rfm_persona = case_when(
            between(r,3,5) & between(f,3,5) ~ "Champion",
            between(r,1,2) & between(f,3,5) ~ "Can't Lose",
            between(r,4,5) & between(f,1,2) ~ "New",
            between(r,3,3) & between(f,1,2) ~ "Promising",
            between(r,1,2) & between(f,2,2) ~ "At Risk",
            between(r,1,2) & between(f,1,1) ~ "Lost",
            T ~ ""),
        rfm_persona_rev = rfm_persona) %>% 
    select(customer_name, rfm_persona, rfm_persona_rev,
           r, recency, f, frequency, m, monetary, receipt,
           contains("rfm_"), contains("total_"), everything()) %>% 
    mutate(recency = -1*recency)

dat_customers$rfm_persona <- 
    factor(dat_customers$rfm_persona, 
           levels = c("Champion", "Can't Lose", "New", 
                      "Promising","At Risk", "Lost")) 

dat_customers$rfm_persona_rev <- 
    factor(dat_customers$rfm_persona_rev, 
         levels = c( "Lost", "At Risk", "Promising", 
                     "New", "Can't Lose","Champion" )) 

dat_customers <- 
    dat_customers %>% 
    arrange(rfm_persona,
            desc(frequency), desc(monetary), desc(recency))

dat_customers %>% slicedt()

3 Revenue over Time

3.1 Past n Days (n=7, 28, 120)

Diving into the recent performance of Bloom Daddy. We use 7, 28, and 120 day intervals as they reduce seasonality seen in other models. To elaborate, many people use 30 days, but this can add volatility to your numbers as you’re accounting for 4 weeks and 2 days vs. a clean 4 weeks. If you experience varied productivity throughout the week, the 30 days can hurt you depending on the seasonality.

dat_past_performance <- dat_orderlines %>% 
    mutate(
        analysis_date = ymd("20220101"),
        ddays = sale_date - analysis_date) %>% 
    mutate(
        ddays = ddays %>% as.integer(),
        flag_past7 = case_when(ddays >= -7 ~ 1, T ~ 0),
        flag_past28 = case_when(ddays >= -28 ~ 1, T ~ 0),
        flag_past120 = case_when(ddays >= -120 ~ 1, T ~ 0),
        flag_past365 = case_when(ddays >= -365 ~ 1, T ~ 0)
        ) %>% 
        select(sale_date, item_name, item_group, price, contains("past"))

dat_pp7 <- dat_past_performance %>% 
    filter(flag_past7 == 1) %>% 
    summarise(total_spend = sum(price)) %>% 
    mutate(phase = "past7")

dat_pp28 <- dat_past_performance %>% 
    filter(flag_past28 == 1) %>% 
    summarise(total_spend = sum(price)) %>% 
    mutate(phase = "past28")

dat_pp120 <- dat_past_performance %>% 
    filter(flag_past120 == 1) %>% 
    summarise(total_spend = sum(price)) %>% 
    mutate(phase = "past120")

dat_pp365 <- dat_past_performance %>% 
    filter(flag_past365 == 1) %>% 
    summarise(total_spend = sum(price)) %>% 
    mutate(phase = "past365")

dat_pp7 %>% rbind(dat_pp28) %>% rbind(dat_pp120) %>% rbind(dat_pp365) %>% 
    pivot_wider(names_from = phase, values_from = total_spend) %>% 
    mutate_all(scales::dollar) %>% 
    tablekable()
past7 past28 past120 past365
$1,796 $7,896 $33,384 $229,536

3.2 Week-over-Week (WoW)

3.2.1 Data

viz_timelineW <- dat_orderlines %>% 
    mutate(sale_week = FLOOR_WEEK(sale_date)) %>% 
    group_by(sale_week) %>% 
    summarise(total_spend = sum(price)) %>% 
    ggplot(aes(x = sale_week, 
        y = total_spend, 
        fill = total_spend,
        text = str_glue("Date: {sale_week}
                         Revenue: {scales::dollar(total_spend)}"))) +
    geom_smooth() +
    geom_col() +
    theme_ipsum() +
    labs(
        title = "Sales over Time",
        subtitle = "",
        caption = "",
        x = "Week",
        y = "Total Spend",
        fill = "Total Spend"#,
       # color = "Persona"
       ) +
    viridis::scale_fill_viridis(direction = 1, discrete = FALSE) +
    theme(
      legend.position = "bottom",
#     axis.text = element_blank(),
#     axis.text.x = element_text(angle = 90),
#     axis.title = element_blank(),
#      panel.grid = element_blank(),
     plot.margin = margin(.5, .5, .5, .5, "cm")
) 

3.2.2 Interactive

plotly::ggplotly(viz_timelineW, tooltip = "text")

3.2.3 Static

viz_timelineW

3.3 Top Items

3.3.1 Data

stat_items <- dat_orderlines %>% 
    group_by(item_group, item_name) %>% 
    summarise(n_items = n(),
              n_orders = n_distinct(order_id),
              sum_price = sum(price)
              ) %>% ungroup() %>% 
    mutate(
        revenue = sum_price %>% scales::dollar(accuracy = 1,big.mark = ","),
        perc_total = sum_price/sum(sum_price),
        percent_total = perc_total %>% scales::percent(accuracy = 0.01),
        rankv = -perc_total
    ) %>% select(item_group, item_name, 
                 n_items, n_orders, revenue, percent_total, 
                 everything()) %>% 
    arrange(desc(sum_price),n_orders, n_items) %>% 
    mutate(rank = rank(rankv))
## `summarise()` has grouped output by 'item_group'. You can override using the
## `.groups` argument.
viz_items <- stat_items %>% 
    filter(perc_total >= 0.01) %>% 
    ggplot(aes(x = item_name %>% fct_reorder(sum_price), 
               y = sum_price, 
               fill = sum_price,
               text = str_glue(" 
                               Name: {item_name}
                               Revenue: {revenue}
                               % Revenue: {percent_total}
                               Items Sold: {n_items}
                               Total Orders: {n_orders}"))) +
    geom_col()+
    theme_ipsum() +
  #  ylim(0, 750) +
    coord_flip()  +
    labs(
        title = "Top Performing Products",
        subtitle = "Total Spend vs Product",
        caption = "",
        x = "Product",
        y = "Total Spend",
        fill = "Total Spend") +
    viridis::scale_fill_viridis(direction = -1, 
                                discrete = FALSE, 
                                option = "plasma") +
    theme(
      legend.position = "bottom",
      panel.grid = element_blank(),
      plot.margin = margin(.5, .5, .5, .5, "in")
) 

3.3.2 Interactive

viz_items %>% plotly::ggplotly(tooltip = "text")

3.3.3 Static

viz_items

3.4 Top Product Categories

3.4.1 Data

stat_groups <- dat_orderlines %>% 
    filter(item_group != "Heat Pack") %>% 
    group_by(item_group) %>% 
    mutate(receipt = paste0(item_name)) %>% 
    summarise(items_sold = toString(unique(receipt)),
              n_items = n(),
              n_orders = n_distinct(order_id),
              sum_price = sum(price)
              ) %>% ungroup() %>% 
    mutate(
        revenue = sum_price %>% scales::dollar(accuracy = 1),
        perc_total = sum_price/sum(sum_price),
        percent_total = perc_total %>% scales::percent(accuracy = 0.01),
        rankv = -perc_total
    ) %>% select(item_group, items_sold, 
                 n_items, n_orders, revenue, percent_total,
                 everything()) %>% 
    arrange(desc(sum_price),n_orders, item_group) %>% 
    mutate(rank = rank(rankv))

viz_groups <- stat_groups %>% 
    filter(rank <= 10) %>% 
    ggplot(aes(x = item_group %>% fct_reorder(sum_price), 
               y = sum_price, 
               fill = sum_price,
               text = str_glue(" 
                               Group: {item_group}
                               Revenue: {revenue}
                               % Revenue: {percent_total}
                               Products Sold: {n_items}
                               Total Orders: {n_orders}
                               Products Sold: {items_sold}"
               ))) +
    geom_col()+
    coord_flip() +
    theme_ipsum()  +
    labs(
        title = "Top Performing Product Groups",
        subtitle = "Total Spend vs Product Groups",
        caption = "",
        x = "Product Group",
        y = "Total Spend",
        fill = "Total Spend") +
    theme(
      legend.position = "bottom",
      plot.margin = margin(.5, .5, .5, .5, "cm")) +
    viridis::scale_fill_viridis(direction = -1, 
                                discrete = FALSE, 
                                option = "plasma") +
    theme(
      legend.position = "bottom",
      panel.grid = element_blank(),
      plot.margin = margin(.5, .5, .5, .5, "in")
) 

3.4.2 Interactive

viz_groups %>% plotly::ggplotly(tooltip = "text")

3.4.3 Static

viz_groups

4 RFM Customer Segmentation

An RFM Analysis is built on three components:

  1. Recency - the days since a consumer last purchased.
  2. Frequency - the total number of orders purchased.
  3. Monetary - the total revenue generated by the consumer.

By using these facets, we can segment our data base and prescribe strategies tailored to the user’s behavior. For instance, a user with high monetary and frequency values, but low recency can be targeted with marketing to intervene and prevent them from lapsing.

4.1 Persona Details

Personas use Recency, Frequency, and Monetary in a 3D-fashion to help articulate tactics for the different consumers present. Based on our definition, we have six options total: Champions, Loyal, Recent, High Potential, Needs Nurturing, and Inactive. Unfortunately, we were yet to produce Champions as they need consistently high marks across all three facets.

dat_customers %>% 
    group_by(rfm_persona) %>% 
    summarise(n = n(),
              min_r = min(recency),
              max_r = max(recency),
              min_f = min(frequency),
              max_f = max(frequency),
              min_m = min(monetary),
              max_m = max(monetary),
              .groups = "drop"
              ) %>% ungroup() %>% 
    mutate(perc_total = n/sum(n)) %>% 
    mutate(perc_total = scales::percent(perc_total)) %>% 
    select(rfm_persona, n, perc_total, everything()) %>% tabledata()

4.2 Visualizations

4.2.1 Recency

 viz_recency <- dat_customers %>% 
    ggplot(aes(x = rfm_persona_rev, 
               y = recency, 
               fill = rfm_persona_rev
               )) +
    ggdist::stat_halfeye(aes(color = rfm_persona_rev),
        adjust = 0.5, 
        justification = -.2, 
        .width = 0,
        point_colour = NA, 
        na.rm = T) +
    geom_boxplot(aes(color = rfm_persona_rev),
        width = .4, 
        outlier.color = NA, 
        outlier.alpha =  0.33,
        alpha = 0.66, 
        na.rm = T) +
    ggdist::stat_dots(aes(color = rfm_persona_rev),
         side = "left", 
         justification = 1.1, 
         binwidth = .5) +
    theme_ipsum() +
    ylim(0, 365) +
    coord_flip()  +
    labs(
        title = "Persona Distributions: Recency",
        subtitle = "Recency Raincloud",
        caption = "Here we can see the persona definitions more clearly as the different groups are heavily driven by recency.",
        x = "Persona",
        y = "Recency",
        fill = "Persona",
        color = "Persona") +
    viridis::scale_fill_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    viridis::scale_color_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    theme(
      legend.position = "bottom",
      panel.grid = element_blank(),
      plot.margin = margin(.5, .5, .5, .5, "in")
) 

viz_recency

4.2.2 Frequency

 viz_frequency <- dat_customers %>% 
    ggplot(aes(x = rfm_persona_rev, 
               y = frequency, 
               fill = rfm_persona_rev
               )) +
    ggdist::stat_halfeye(aes(color = rfm_persona_rev),
        adjust = 0.5, 
        justification = -.2, 
        .width = 0,
        point_colour = NA, 
        na.rm = T) +
    geom_boxplot(aes(color = rfm_persona_rev),
        width = .4, 
        outlier.color = NA, 
        outlier.alpha =  0.33,
        alpha = 0.66, 
        na.rm = T) +
    # ggdist::stat_dots(aes(color = rfm_persona_rev),
    #      side = "left", 
    #      justification = 1.1, 
    #      binwidth = .01) +
    theme_ipsum() +
    ylim(1,5) +
    coord_flip()  +
    labs(
        title = "Persona Distributions: Frequency",
        subtitle = "Frequency Raincloud",
        caption = "Here we see CHAMPIONS and CAN'T LOSE reaching into the higher frequency tiers via repeated purchases",
        x = "Persona",
        y = "Monetary",
        fill = "Persona",
        color = "Persona") +
    viridis::scale_fill_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    viridis::scale_color_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    theme(
      legend.position = "bottom",
      panel.grid = element_blank(),
      plot.margin = margin(.5, .5, .5, .5, "in")
) 

viz_frequency

4.2.3 Monetary

 viz_monetary <- dat_customers %>% 
    ggplot(aes(x = rfm_persona_rev, 
               y = monetary, 
               fill = rfm_persona_rev
               )) +
    ggdist::stat_halfeye(aes(color = rfm_persona_rev),
        adjust = 0.5, 
        justification = -.2, 
        .width = 0,
        point_colour = NA, 
        na.rm = T) +
    geom_boxplot(aes(color = rfm_persona_rev),
        width = .4, 
        outlier.color = NA, 
        outlier.alpha =  0.33,
        alpha = 0.66, 
        na.rm = T) +
    ggdist::stat_dots(aes(color = rfm_persona_rev),
         side = "left", 
         justification = 1.1, 
         binwidth = .5) +
    theme_ipsum() +
    ylim(0, 750) +
    coord_flip()  +
    labs(
        title = "Persona Distributions: Monetary",
        subtitle = "Monetary Raincloud",
        caption = "Although we see CAN'T LOSE and CHAMPIONS share a broad range, note how much higher their average spend is.",
        x = "Persona",
        y = "Monetary",
        fill = "Persona",
        color = "Persona") +
    viridis::scale_fill_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    viridis::scale_color_viridis(direction = -1, 
                                discrete = TRUE, 
                                option = "plasma") +
    theme(
      legend.position = "bottom",
      panel.grid = element_blank(),
      plot.margin = margin(.5, .5, .5, .5, "in")
) 

viz_monetary

5 Conclusion

6 Appendix

6.1 Persona Ranges

6.1.1 Deep Dive

dat_customers %>% 
    group_by(rfm_persona,rfm_score,r,f,m) %>% 
    summarise(n = n(),
              min_r = min(recency),
              max_r = max(recency),
              min_f = min(frequency),
              max_f = max(frequency),
              min_m = min(monetary),
              max_m = max(monetary),
              .groups = "drop"
              ) %>% ungroup() %>% 
    mutate(perc_total = n/sum(n)) %>% 
    mutate(perc_total = scales::percent(perc_total)) %>% 
    select(rfm_persona, rfm_score, r,f,m,n, perc_total, 
           everything()) %>% tabledata()

6.1.2 Recency

dat_customers %>% 
    group_by(rfm_persona,r) %>% 
    summarise(n = n(),
              min_r = min(recency),
              max_r = max(recency),
              min_f = min(frequency),
              max_f = max(frequency),
              min_m = min(monetary),
              max_m = max(monetary),
              .groups = "drop"
              ) %>% ungroup() %>% 
    mutate(perc_total = n/sum(n)) %>% 
    mutate(perc_total = scales::percent(perc_total)) %>% 
    select(rfm_persona, r,n, perc_total, everything())  %>% tabledata()

6.1.3 Frequency

dat_customers %>% 
    group_by(rfm_persona,f) %>% 
    summarise(n = n(),
              min_r = min(recency),
              max_r = max(recency),
              min_f = min(frequency),
              max_f = max(frequency),
              min_m = min(monetary),
              max_m = max(monetary),
              .groups = "drop"
              ) %>% ungroup() %>% 
    mutate(perc_total = n/sum(n)) %>% 
    mutate(perc_total = scales::percent(perc_total)) %>% 
    select(rfm_persona,f, n, perc_total, everything())  %>% tabledata()

6.1.4 Monetary

dat_customers %>% 
    group_by(rfm_persona,m) %>% 
    summarise(n = n(),
              min_r = min(recency),
              max_r = max(recency),
              min_f = min(frequency),
              max_f = max(frequency),
              min_m = min(monetary),
              max_m = max(monetary),
              .groups = "drop"
              ) %>% ungroup() %>% 
    mutate(perc_total = n/sum(n)) %>% 
    mutate(perc_total = scales::percent(perc_total)) %>% 
    select(rfm_persona, m, n,
           perc_total, everything()) %>% tabledata()

6.2 RMarkdown Syntax

This is what plain text will look like (use two spaces after the line to create a new paragraph).
This is what plain text will look like (use one < br > to create a small page break).
This is what plain text will look like (use two < br > to create a small page break).

This is what plain text will look like (use three < br > to create a medium page break).


italics and italics via wrapping * or _ around a body.

bold and bold via double wrapping * * or _ _ around a body.

superscript2 via wrapping ^ around a body.

strikethrough via double wrapping ~ around a body.

link_label via wrapping [ around the link_label and ( around the link. This works for cross-referencing.

endash: – via double dash.

emdash: — via triple dash.

ellipsis: … via three, consecutive periods.

inline equation: \(A = \pi*r^{2}\) via wrapping $ around the body.

horizontal rule (or slide break): *** > block quote with a lot of words and even some text here

  • unordered list
  • item 2
  • sub-item 1
  • sub-item 2

  1. ordered list
  2. item 2
  • sub-item 1
  • sub-item 2 Table Header | Second Header ————- | ————- Table Cell | Cell 2 Cell 3 | Cell 4 image: periwinkle monstera gif: gif2